Structured Deep Neural Network Pruning via Matrix Pivoting
Abstract
Deep Neural Networks (DNNs) are the key to state-of-the-art machine vision, sensor fusion, and audio/video signal processing. Unfortunately, their computational complexity and the tight resource constraints on the Edge make them hard to leverage on mobile, embedded, and IoT devices. Due to the great diversity of Edge devices, DNN designers have to take the hardware platform and application requirements into account during network training. In this work we introduce pruning via matrix pivoting as a way to improve network pruning by compromising between the design flexibility of architecture-oblivious pruning and the performance efficiency of architecture-aware pruning, the two dominant techniques for obtaining resource-efficient DNNs. We also describe local and global network optimization techniques for efficient implementation of the resulting pruned networks. In combination, the proposed pruning and implementation techniques result in a close-to-linear speed-up with the reduction of network coefficients during pruning.
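As a hedged illustration of the idea (the abstract does not spell out the algorithm), the sketch below prunes a weight matrix in a structured way by first permuting its rows (a "pivot") so that low-magnitude rows form one contiguous block that can be dropped as a unit. The L2-norm criterion, the 50% keep ratio, and the function names are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch of structured pruning with a row permutation ("pivoting"),
# so that pruned weights form one contiguous, hardware-friendly block.
import numpy as np

def pivot_and_prune(W, keep_ratio=0.5):
    """Permute the rows of W by descending L2 norm and zero out the weakest block."""
    row_norms = np.linalg.norm(W, axis=1)
    perm = np.argsort(-row_norms)            # strongest rows first ("pivot" order)
    W_pivoted = W[perm].copy()
    n_keep = int(round(keep_ratio * W.shape[0]))
    W_pivoted[n_keep:] = 0.0                 # prune a contiguous block of rows
    return W_pivoted, perm                   # perm must also be applied to the layer's outputs

W = np.random.randn(8, 4)
W_pruned, perm = pivot_and_prune(W)
print(np.count_nonzero(W_pruned), "nonzeros remain out of", W.size)
```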
Similar resources
Large-scale Dictionary Construction via Pivot-based Statistical Machine Translation with Significance Pruning and Neural Network Features
We present our ongoing work on large-scale Japanese-Chinese bilingual dictionary construction via pivot-based statistical machine translation. We utilize statistical significance pruning to control noisy translation pairs that are induced by pivoting. We construct a large dictionary which we manually verify to be of a high quality. We then use this dictionary and a parallel corpus to learn bili...
Structured Bayesian Pruning via Log-Normal Multiplicative Noise
Dropout-based regularization methods can be regarded as injecting random noise with pre-defined magnitude into different parts of the neural network during training. It was recently shown that the Bayesian dropout procedure not only improves generalization but also leads to extremely sparse neural architectures by automatically setting the individual noise magnitude per weight. However, this sparsity...
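For intuition only, the following sketch injects multiplicative log-normal noise into the outputs of a linear layer during training; the single shared sigma is an assumption made here for brevity, whereas the excerpt describes per-weight noise magnitudes learned in a Bayesian manner.

```python
# Minimal illustration of multiplicative log-normal noise injection (simplified stand-in).
import numpy as np

rng = np.random.default_rng(0)

def noisy_linear(x, W, sigma=0.5, training=True):
    """Linear layer whose outputs are scaled by log-normal noise while training."""
    y = x @ W
    if training:
        y = y * rng.lognormal(mean=0.0, sigma=sigma, size=y.shape)
    return y

x = rng.standard_normal((2, 4))
W = rng.standard_normal((4, 3))
print(noisy_linear(x, W).shape)   # (2, 3)
```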
Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers
Model pruning has become a useful technique that improves the computational efficiency of deep learning, making it possible to deploy solutions in resource-limited scenarios. A widely used practice in relevant work assumes that a smaller-norm parameter or feature plays a less informative role at inference time. In this paper, we propose a channel pruning technique for accelerating the computa...
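The "smaller-norm-less-informative" practice that this excerpt questions can be sketched as follows: rank convolution filters by their norm and zero the weakest output channels. This illustrates the conventional baseline assumption, not the technique the paper itself proposes; the tensor shapes and the pruning count are illustrative.

```python
# Sketch of norm-based channel pruning (the baseline practice discussed above).
import numpy as np

def prune_channels_by_norm(conv_w, n_prune):
    """conv_w: (out_channels, in_channels, kH, kW). Zero the n_prune filters with the smallest L2 norm."""
    norms = np.linalg.norm(conv_w.reshape(conv_w.shape[0], -1), axis=1)
    drop = np.argsort(norms)[:n_prune]       # output channels with the smallest norm
    pruned = conv_w.copy()
    pruned[drop] = 0.0
    return pruned, drop

w = np.random.randn(16, 8, 3, 3)
pruned, dropped = prune_channels_by_norm(w, n_prune=4)
print("pruned output channels:", sorted(dropped.tolist()))
```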
Exploring the Regularity of Sparse Structure in Convolutional Neural Networks
Sparsity helps reduce the computational complexity of deep neural networks by skipping zeros. Taking advantage of sparsity is listed as a high priority in next-generation DNN accelerators such as the TPU [1]. The structure of sparsity, i.e., the granularity of pruning, affects the efficiency of hardware accelerator design as well as the prediction accuracy. Coarse-grained pruning brings more reg...
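To make the granularity point concrete, the sketch below compares a fine-grained per-weight mask with a coarse-grained per-row (vector-level) mask at the same overall sparsity; the 50% sparsity level and the magnitude criteria are assumptions for illustration.

```python
# Fine-grained vs. coarse-grained pruning masks at equal sparsity.
import numpy as np

W = np.random.randn(6, 6)
sparsity = 0.5

# Fine-grained: zero the individual weights with the smallest magnitude.
fine = W * (np.abs(W) > np.quantile(np.abs(W), sparsity))

# Coarse-grained: zero whole rows with the smallest L1 norm (hardware-friendlier structure).
coarse = W.copy()
coarse[np.argsort(np.abs(W).sum(axis=1))[:int(sparsity * W.shape[0])]] = 0.0

print("fine-grained nonzeros:", np.count_nonzero(fine),
      "| coarse-grained nonzeros:", np.count_nonzero(coarse))
```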
Compact Deep Convolutional Neural Networks With Coarse Pruning
The learning capability of a neural network improves with increasing depth at higher computational costs. Wider layers with dense kernel connectivity patterns further increase this cost and may hinder real-time inference. We propose feature map and kernel level pruning for reducing the computational complexity of a deep convolutional neural network. Pruning feature maps reduces the width of a l...
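The two pruning levels mentioned in this excerpt can be sketched roughly as follows: feature-map pruning removes whole output channels (as in the channel-pruning sketch above), while kernel-level pruning zeroes individual kH x kW kernels inside a filter. The L1 criterion and the pruning fraction below are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of kernel-level pruning: drop the weakest individual (out, in) kernels.
import numpy as np

def prune_kernels_by_l1(conv_w, fraction=0.25):
    """conv_w: (out_channels, in_channels, kH, kW). Zero the weakest fraction of kernels."""
    kernel_l1 = np.abs(conv_w).sum(axis=(2, 3))           # one score per (out, in) kernel
    n_drop = int(fraction * kernel_l1.size)
    flat_idx = np.argsort(kernel_l1, axis=None)[:n_drop]  # weakest kernels overall
    out_idx, in_idx = np.unravel_index(flat_idx, kernel_l1.shape)
    pruned = conv_w.copy()
    pruned[out_idx, in_idx] = 0.0
    return pruned

w = np.random.randn(16, 8, 3, 3)
print("remaining nonzero kernels:",
      np.count_nonzero(np.abs(prune_kernels_by_l1(w)).sum(axis=(2, 3))))
```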
Journal: CoRR
Volume: abs/1712.01084
Pages: -
Publication year: 2017